Model Selection

Mixture of Experts

# Mixture of Experts

Qwen3 235B A22B GPTQ Int4

Qwen3 is the latest generation of large language models in the Qwen series, offering a range of dense and mixture-of-experts (MoE) models. Through extensive training, Qwen3 has achieved groundbreaking progress in reasoning, instruction following, agent capabilities, and multilingual support.

Large Language Model

Qwen3 235B A22B FP8 Dynamic

The FP8 quantized version of the Qwen3-235B-A22B model, which effectively reduces GPU memory requirements and improves computational throughput, suitable for various natural language processing scenarios.

Large Language Model

Qwen3 30B A3B 128K GGUF

Qwen3 is the latest generation of large language models in the Tongyi Qianwen series, offering a complete system of dense and mixture-of-experts (MoE) models. Based on extensive training, Qwen3 achieves breakthrough progress in reasoning, instruction following, agent capabilities, and multilingual support.

Large Language Model English

Qwen3 235B A22B 128K GGUF

Qwen3 is the latest generation large language model in the Tongyi Qianwen series, offering a complete suite of dense and Mixture of Experts (MoE) models. Based on large-scale training, Qwen3 has achieved breakthrough progress in reasoning, instruction following, agent capabilities, and multilingual support.

Large Language Model English

Qwen3 235B A22B GGUF

Qwen3 is the latest generation of large language models in the Qwen series, offering a range of dense and mixture of experts (MoE) models. Based on extensive training, Qwen3 has achieved breakthrough progress in reasoning, instruction following, agent capabilities, and multilingual support.

Large Language Model English

Mmrexcev GRPO V0.420

This is a pre-trained language model merged using the SLERP method, combining the characteristics of both Captain-Eris_Violet-GRPO-v0.420 and MMR-E1 models.

Large Language Model

Llama4some SOVL 4x8B L3 V1

This is a Mixture of Experts model obtained by merging multiple pre-trained language models using mergekit, aiming to create the most unconstrained text generation capability.

Large Language Model

Chicka Mixtral 3x7b

A Mixture of Experts large language model based on 3 Mistral architecture models, excelling in dialogue, code, and mathematical tasks

Large Language Model

Llama 3 Smaug 8B GGUF

GGUF format quantized model based on abacusai/Llama-3-Smaug-8B, supporting 2-8 bit quantization levels, suitable for text generation tasks

Large Language Model

Copus-2x8B is a Mixture of Experts model based on the Llama-3-8B architecture, combining fine-tuned versions of dreamgen/opus-v1.2-llama-3-8b and NousResearch/Meta-Llama-3-8B-Instruct.

Large Language Model

lodrick-the-lafted

Wizardlm 2 8x22B

WizardLM-2 8x22B is the state-of-the-art Mixture of Experts (MoE) model developed by Microsoft's WizardLM team, with significant performance improvements in complex dialogues, multilingual tasks, reasoning, and agent tasks.

Large Language Model

Wizardlm 2 8x22B

WizardLM-2 8x22B is the next-generation state-of-the-art large language model developed by Microsoft AI, featuring a Mixture of Experts (MoE) architecture, excelling in complex dialogue, multilingual capabilities, reasoning, and agent tasks.

Large Language Model

Zephyr Orpo 141b A35b V0.1

Zephyr 141B-A39B is a large language model fine-tuned from Mixtral-8x22B-v0.1, trained using the ORPO alignment algorithm, designed to be a helpful assistant.

Large Language Model

Phalanx 512x460M MoE

LiteLlama-460M-1T is a lightweight mixture of experts model with 512 experts, suitable for efficient inference and text generation tasks.

Large Language Model

Transformers English

Beyonder 4x7B V2

Beyonder-4x7B-v2 is a large language model based on the Mixture of Experts (MoE) architecture, consisting of 4 expert modules, each specializing in different domains such as dialogue, programming, creative writing, and mathematical reasoning.

Large Language Model

Dolphin 2.7 Mixtral 8x7b AWQ

Dolphin 2.7 Mixtral 8X7B is a large language model based on the Mixtral architecture, focusing on code generation and instruction-following tasks.

Large Language Model

Transformers English

Dolphin 2.5 Mixtral 8x7b GPTQ

Dolphin 2.5 Mixtral 8X7B is a large language model developed by Eric Hartford based on the Mixtral architecture, fine-tuned on multiple high-quality datasets, suitable for various natural language processing tasks.

Large Language Model

Transformers English

Featured Recommended AI Models

AIbase

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご

© 2025AIbase